
How-To Guides

Backend Setup

First, set up a Python virtual environment (e.g. conda or venv) and install the dependencies.

With venv:

cd backend
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
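Or with conda (the environment name and Python version here are just examples):

cd backend
conda create -n science-agent python=3.11   # name and version are illustrative
conda activate science-agent
pip install -r requirements.txt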

To execute the code generated by the science agent, you need Docker installed on your system. Note that on Linux, Docker requires root access by default. You can either add your user to the docker group, run the backend with root privileges, or configure Docker in rootless mode.
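For example, to add your user to the docker group (you will need to log out and back in, or run newgrp docker, for the change to take effect):

sudo usermod -aG docker $USER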

To build the Docker image for running generated programs, run the following command. The USER_UID and USER_GID build arguments set the user inside the container to match your host user, which avoids permission issues on Linux systems:

cd backend/sci_agent_docker
docker build --build-arg USER_UID=$(id -u) --build-arg USER_GID=$(id -g) -t science-agent .
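You can confirm the image was built:

docker image ls science-agent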

Before running the backend, see the Configuration section to set up your LLM and storage provider.

The backend uses Quart, an asynchronous Python web framework. To start the dev server, run the following:

cd backend
quart run
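If Quart cannot locate the application automatically, point it at the module that defines the app via the QUART_APP environment variable (app:app below is an assumption; substitute the actual module and variable names):

QUART_APP=app:app quart run   # app:app is illustrative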

Frontend Setup

First, make sure Node.js and npm are installed. Then run the following commands in your terminal to install the necessary dependencies:

cd frontend
npm install

To run the frontend locally, start the dev server, which serves the app at http://localhost:5173:

npm run dev

Configuration

Backend

Configuration properties for the backend are stored in backend/config.py. To run the backend locally, the only required configuration is your default LLM provider, set via the LLM_ENGINE_NAME, LLM_BASE_URL, and LLM_REGION_NAME variables. Any provider supported by LiteLLM can be used. If your provider requires credentials, you will also need to set the LLM_API_KEY environment variable with the appropriate API key. Note that if no LLM engine is specified in the config file, the user will be required to provide one in the interface.
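For illustration only (the values below are placeholders; the comments in config.py describe each variable's exact semantics), a Bedrock-backed setup might look like:

LLM_ENGINE_NAME = "bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0"  # any LiteLLM model id
LLM_BASE_URL = None            # placeholder; typically only needed for OpenAI-compatible or self-hosted endpoints
LLM_REGION_NAME = "us-east-1"  # used by region-scoped providers such as Bedrock

If your provider needs a key, export it before starting the backend:

export LLM_API_KEY={your-api-key}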

By default, sessions and task-related files are stored on the local filesystem to ease testing and development. However, for production use, you may configure AWS S3 for file storage and DynamoDB for session data by setting the STORAGE_BACKEND, S3_BUCKET, and AGENT_SESSION_BACKEND variables. See the comments in config.py for more details.
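As a sketch, such a setup might look like the following (the string values are assumptions for illustration; config.py documents the accepted values):

STORAGE_BACKEND = "s3"                  # assumed value; check config.py for accepted options
S3_BUCKET = "my-science-agent-files"    # your bucket name
AGENT_SESSION_BACKEND = "dynamodb"      # assumed value; check config.py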

To use AWS Bedrock, S3, or DynamoDB, you must also configure your AWS credentials. At a minimum, ~/.aws/credentials needs the following properties:

[default]
aws_access_key_id = {YOUR_AWS_ID}
aws_secret_access_key = {YOUR_AWS_KEY}

Alternatively, use the AWS CLI to set up your local config:

aws configure

Frontend

There are two configuration files in the frontend directory: .env.development and .env.production. The development file is used when running the frontend locally, while the production file is used when building the frontend for deployment.

The default values suffice for running locally, but if you are using S3 for file storage you will need to configure the following property:

VITE_STATIC_FILE_BASE_URL=https://{your-s3-bucket-name}.s3.amazonaws.com

Deployment

To deploy the backend, the above instructions for running locally also apply, except that you should use a production-ready ASGI server such as Hypercorn instead of the Quart dev server. See the official Quart documentation for more details on deploying Quart applications.
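For example, assuming the Quart application object is exposed as app in a module named app (adjust both to match the codebase):

cd backend
hypercorn app:app --bind 0.0.0.0:8000   # app:app and the port are assumptions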

To deploy the frontend, open .env.production in the frontend directory and set the URL or server IP address where your backend is hosted:

VITE_API_BASE_URL=https://{your-backend-url}

Next, build the production version of the frontend by running:

npm run build

This creates a dist directory containing the production build of the frontend, which you can serve with any static file server.
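For a quick local check, Python's built-in server works (the port is arbitrary); a real deployment would typically put the files behind a web server such as nginx or a CDN:

python3 -m http.server 8080 --directory dist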

Example Tasks

To load all the tasks from the ScienceAgentBench dataset into the backend and make them available for use, download the dataset into the backend/benchmark directory and run the following script:

cd backend
python preload_benchmark.py